Emotion recognition model based on Hybrid-Mel Gammatone Frequency Cross-attention Transformer
Mu LI, Yuheng YANG, Xizheng KE
Journal of Computer Applications 2024, 44 (1): 86-93. DOI: 10.11772/j.issn.1001-9081.2023060753

An emotion recognition model based on a Hybrid-Mel Gammatone Frequency Cross-attention Transformer (H-MGFCT) was proposed to address two issues in multimodal sentiment analysis: effectively mining single-modal representation information and fully fusing multimodal information. Firstly, the Hybrid-Mel Gammatone Frequency Cepstral Coefficient (H-MGFCC) was obtained by fusing the Mel Frequency Cepstral Coefficient (MFCC) and the Gammatone Frequency Cepstral Coefficient (GFCC) together with their first-order dynamic features, alleviating the loss of speech emotional features. Secondly, a cross-modal prediction model based on attention weights was used to select the text features most relevant to the speech features. Then, a Cross Self-Attention Transformer (CSA-Transformer) incorporating contrastive learning was used to fuse these highly correlated text features with the speech-modality emotional features. Finally, the resulting cross-modal features, containing both text and speech information, were fused with the remaining weakly correlated text features to supplement the information discarded by the selection step. Experimental results show that, compared with the weighted Decision Level Fusion Text-audio (DLFT) model, the proposed model improves accuracy by 2.83, 2.64, and 3.05 percentage points on the publicly available IEMOCAP (Interactive EMotional dyadic MOtion CAPture), CMU-MOSI (CMU Multimodal Opinion Sentiment Intensity), and CMU-MOSEI (CMU Multimodal Opinion Sentiment and Emotion Intensity) datasets, respectively, verifying the effectiveness of the model for emotion recognition.
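
The H-MGFCC construction step lends itself to a short illustration. The Python sketch below concatenates MFCC, GFCC, and their first-order delta features along the coefficient axis. The fusion rule (concatenation), the coefficient counts, and the placeholder inputs are all assumptions, since the abstract specifies neither the fusion operation nor the GFCC extractor; this is a minimal sketch, not the authors' implementation.

# Minimal sketch of H-MGFCC feature construction as described in the
# abstract: fuse MFCC, GFCC, and their first-order dynamic (delta)
# features. Concatenation along the coefficient axis is an assumption;
# the GFCC extractor is assumed to exist elsewhere, so both inputs are
# placeholder arrays here.
import numpy as np
import librosa


def build_h_mgfcc(mfcc: np.ndarray, gfcc: np.ndarray) -> np.ndarray:
    """Fuse MFCC and GFCC with their first-order deltas.

    mfcc, gfcc: arrays of shape (n_coeffs, n_frames).
    Returns an array of shape (4 * n_coeffs, n_frames).
    """
    d_mfcc = librosa.feature.delta(mfcc)  # first-order dynamics of MFCC
    d_gfcc = librosa.feature.delta(gfcc)  # first-order dynamics of GFCC
    return np.concatenate([mfcc, gfcc, d_mfcc, d_gfcc], axis=0)


if __name__ == "__main__":
    # Placeholder matrices standing in for real MFCC/GFCC features
    # (13 coefficients x 100 frames are assumed dimensions).
    rng = np.random.default_rng(0)
    mfcc = rng.standard_normal((13, 100))
    gfcc = rng.standard_normal((13, 100))
    print(build_h_mgfcc(mfcc, gfcc).shape)  # (52, 100)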

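The cross-attention fusion step can be sketched in the same spirit. The PyTorch layer below lets text tokens act as queries over speech frames, then applies self-attention and a feed-forward block. The layer layout and dimensions are assumptions, and the contrastive-learning objective is omitted, so this illustrates the general cross/self-attention pattern rather than the paper's CSA-Transformer.

# Minimal sketch of cross-attention fusion between text and speech
# sequences, in the spirit of (but not identical to) the
# CSA-Transformer described in the abstract.
import torch
import torch.nn as nn


class CrossSelfAttentionBlock(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, text: torch.Tensor, speech: torch.Tensor) -> torch.Tensor:
        # Cross-attention: text tokens (queries) attend over speech
        # frames (keys/values) to pull in correlated acoustic cues.
        fused, _ = self.cross_attn(text, speech, speech)
        x = self.norm1(text + fused)
        # Self-attention over the fused sequence.
        attended, _ = self.self_attn(x, x, x)
        x = self.norm2(x + attended)
        return x + self.ffn(x)


if __name__ == "__main__":
    block = CrossSelfAttentionBlock()
    text = torch.randn(2, 20, 256)    # (batch, text tokens, dim)
    speech = torch.randn(2, 50, 256)  # (batch, speech frames, dim)
    print(block(text, speech).shape)  # torch.Size([2, 20, 256])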